Improved Katz Smoothi Modeling in Speech

نویسندگان

  • Genqing WU
  • Fang ZHENG
  • Wenhu WU
  • Ling JIN
چکیده

In this paper, a new method is proposed to improve the canonical Katz back-off smoothing technique in language modeling. The process of Katz smoothing is detailedly analyzed and the global discounting parameters are selected for discounting. Further more, a modified version of the formula for discounting parameters is proposed, in which the discounting parameters are determined by not only the occurring counts of the n-gram units but also the low-order history frequencies. This modification makes the smoothing more reasonable for those n-gram units that have homophonic (same in pronunciation) histories. The new method is tested on a Chinese Pinyin-to-character (where Pinyin is the pronunciation string) conversion system and the results show that the improved method can achieve a surprising reduction both in perplexity and Chinese character error rate.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...

متن کامل

Improved katz smoothing for language modeling in speech recogniton

In this paper, a new method is proposed to improve the canonical Katz back-off smoothing technique in language modeling. The process of Katz smoothing is detailedly analyzed and the global discounting parameters are selected for discounting. Further more, a modified version of the formula for discounting parameters is proposed, in which the discounting parameters are determined by not only the ...

متن کامل

شبکه عصبی پیچشی با پنجره‌های قابل تطبیق برای بازشناسی گفتار

Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

متن کامل

Comparison of the Formalin- Ether and the Katz Kato in The Parasitological Diagnosis of Human Fascioliasis

ABSTRACT Human fascioliasis is an endemic disease in the Guillan province and it has became a public health problem in the port of Anzali. Thousands of people in this area were infected during the two outbreaks in 1987 and 1998. If the mature worm settles in the liver, finding the ova in the stool is the most reliable diagnosis of the infection. Therefore selecting the most sensitive parasito...

متن کامل

On enhancing katz-smoothing based back-off language model

Though the statistical language modeling plays an important role in speech recognition, there are still many problems that are difficult to be solved such as the sparseness of training data. Generally, two kinds of smoothing approaches, namely the back-off model and the interpolated model, have been proposed to solve the problem of the impreciseness of language models caused by the sparseness o...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002